DOCUMENT RESUME

ED 398 280                                                  TM 025 483

AUTHOR        Fenster, Mark J.
TITLE         An Assessment of "Middle" Stakes Educational
              Accountability: The Case of Kentucky.
PUB DATE      Apr 96
NOTE          20p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (New York,
              NY, April 8-12, 1996).
PUB TYPE      Reports - Evaluative/Feasibility (142) --
              Speeches/Conference Papers (150)
EDRS PRICE    MF01/PC01 Plus Postage.
DESCRIPTORS   Academic Achievement; *Accountability; *Achievement
              Gains; Educational Assessment; Educational Change;
              Elementary Secondary Education; Equal Education;
              *Incentives; Political Influences; State Legislation;
              State Programs; *Test Construction; *Testing
              Problems; Testing Programs; Test Use
IDENTIFIERS   *Kentucky Education Reform Act 1990; Kentucky
              Instructional Results Information System; *Reform
              Efforts

ABSTRACT

In 1990 the Kentucky state legislature passed the
Kentucky Education Reform Act (KERA), which mandated a total overhaul
of the state's kindergarten through grade 12 public school system and
was designed to result in equitable education for all students.
Accountability components of the KERA include financial incentives
for staff in schools where student gains are exemplary and the use of
sanctions to cause staff in ineffective schools to boost student
achievement gains to an acceptable level. These features are
operationalized through the Kentucky Instructional Results
Information System (KIRIS). KIRIS had to comply with the legislative
mandates and had to produce assessments that would be technically
defensible and politically credible for "middle" stakes, rather than
high- or low-stakes assessment. The state's reform efforts provide an
opportunity to examine systemic reform in assessment and
accountability. This description of assessment development and
implementation demonstrates that the KERA has had classroom impact.
Change has been forced in many areas because the incentive and
sanction provisions have made KERA and KIRIS impossible to ignore.
Now the state's problem is to move on to a reform effort that can
fine tune itself without the external shocks that came with KERA
implementation. In the current political climate, the system
sustained on rewards and sanctions may not last until its supposed
end date in 2012. (Contains 2 figures, 3 tables, and 20 references.)
(SLD)







Reproductions supplied by EDRS are the best that can be made 
from the original document. 










AN ASSESSMENT OF
"MIDDLE" STAKES EDUCATIONAL ACCOUNTABILITY:
THE CASE OF KENTUCKY






by 



Mark J. Fenster 
Wedgewood Associates 
931 Austin St. Apt. 2 
Kalamazoo, MI 49008-1104 USA 



mfaister@edcen.dihs.cmich.edu 



Paper presented at the Annual meeting of the 1996 American Educational Research Association, 

New York City, April 12, 1996. 



BEST COPY AVAILABLE



Background Information 



This paper should be considered in the context of the educational reforms under way in
Kentucky since 1990. The scale of these reforms is massive and unprecedented for any state. In
1989 the Kentucky Supreme Court declared the Commonwealth's existing rules and procedures
for financing schools and delivering educational services to be unconstitutional. In 1990 the state
legislature passed the Kentucky Education Reform Act (KERA), which mandated a total overhaul
of the K-12 public education system and was designed to result in equitable educational services to
all students.

The main features of the KERA are (1) prescribed statewide academic expectations; (2) use
of a model curriculum framework; (3) a commitment to helping all children to become proficient in
performing rigorous state standards that emphasize application of what is learned; (4) heavier
concentration of learning resources on students who are not learning up to their potential; (5)
extensive parental involvement; (6) site-based management of schools; (7) use of financial
incentives to reward staff in schools where student gains are exemplary; and (8) use of sanctions,
including the assistance of distinguished educators, to cause staffs in ineffective schools to bring
student achievement gains to an acceptable level. These last two features of KERA were
operationalized through an integrated, comprehensive assessment system, the Kentucky
Instructional Results Information System (KIRIS).

KIRIS had to comply with legislative mandates, which are evolving; had to provide
performance measures; and had to produce assessments that would be technically defensible and
politically credible for making "middle" stakes decisions on rewards and sanctions to schools. The
Kentucky Department of Education had to develop KIRIS a couple of years before the field of
educational measurement updated the standards for judging assessment systems (Linn, 1994). The
educational measurement profession is in the process of updating its standards. The current
Standards for Educational and Psychological Tests (APA, 1985) do not deal very well with the
"middle stakes", school based, Kentucky assessment system.

Beyond a single state case study, systemic reform initiatives have become more common
in recent years. Kentucky has gone further in systemic reform than most other states in the United
States. For this reason, Kentucky is seen as a bellwether for the country to examine the practicality
of systemic reform initiatives for K-12 public education. Educational researchers can learn from
Kentucky's experience with systemic reform, let alone Kentucky's unique approach to assessment
and accountability.

Brief History of Education in Kentucky Before and After the Passage of KERA 

The Commonwealth of Kentucky used the Kentucky Essential Skills Test (KEST) in the
middle 1980s and the Comprehensive Test of Basic Skills (CTBS-IV) in 1988-1989 and 1989-1990
to assess students. The Commonwealth could take over school districts if their students did
not perform satisfactorily on KEST. However, some people in the state thought there were some
fundamental problems with an accountability system focused at the district level. Within a district,
schools with weak or declining test scores could be counterbalanced by other schools with
strong and improving test scores in that district. This problem led state leaders to reconceptualize
accountability so that it applies at the school level, rather than at the district level.

In June of 1989, the Kentucky Supreme Court ruled the public school system in the
Commonwealth was unconstitutional. Based on the evidence presented in Rose v. the Council for
Better Education (1989), the court concluded that each child in the Commonwealth was not being
provided with an equal opportunity to have an adequate education. The inequities between rich and
poor school districts were too large, depriving children in poorer districts of a fair and equal
opportunity to receive an adequate education. According to the court, the responsibility for
providing an adequate education for all children of the Commonwealth rested with the General
Assembly. In response to the court order, the state legislature passed the Kentucky Education
Reform Act of 1990 (or KERA).



KERA includes a number of legislative mandates, two of which are described here. One
mandate is that a primarily performance-based assessment procedure be used. Instead of using
only multiple-choice questions as did KEST and CTBS-IV, KERA required the Kentucky
Department of Education (KDE) to assess what "students could do with what they know." As a
result of this mandate, KDE designed and has been developing the performance assessment
component of KERA. This assessment system is named the Kentucky Instructional Results
Information System (KIRIS).



KERA also mandated that the assessment system (KIRIS) must be usable for granting
rewards to schools that have an increased proportion of successful students and for delivering
sanctions to schools that have a decreased proportion of successful students. As documented in
Guskey (1994, p. 81):



The legislation requires the State Board to establish ... a threshold level for school
improvement ... to determine the amount of success needed for a school to receive
a reward. The threshold definition shall establish the percentage of increase
required in a school's percentage of successful students, as compared to a school's
present proportion of successful students, with consideration given to the fact that a
school closest to having one hundred percent (100%) successful students will have
a lower percentage increase required.



KERA further requires that school success shall be determined by measuring a school's 
improvement over a two year period. As discussed in Guskey (1994, p. 82), a school that does 
not reach its prescribed threshold level 



. . . but maintains the previous proportion of successful students shall be required
to develop a school improvement plan and shall be eligible to receive funds from the
school improvement fund pursuant to KRS 158.805. A school in which the
proportion of successful students declines by less than five percent (5%) shall be
required to develop a school improvement plan, shall be eligible to receive funds
from the school improvement fund, and shall have one or more Kentucky
distinguished educators assigned to the school to carry out the duties as described in
KRS 158.782. A school in which the proportion of successful students declines by
five percent (5%) or more shall be declared by the State Board for Elementary and
Secondary Education to be a 'school in crisis,' and the State Board is to implement
defined sanctions.



The rewards and sanctions make the KERA reform a "stakes" program. If the percentage
of successful students increased, employees in the school (the principal, teachers, support staff, and
others) were eligible for financial awards (about $1200 in the first biennium). If the percentage of
successful students decreased, schools were given additional assistance in the form of
distinguished educators. In the early years of the Kentucky assessment program, there were no
strongly negative sanctions associated with poor performance on KIRIS. We note here that the
most severe sanction, the school in crisis sanction, has not yet been implemented. Without the
school in crisis sanction, we find it hard to classify the KIRIS assessment system as "high stakes".
For this reason, we classify the Kentucky assessment system as a "middle stakes" program.







Description of the KIRIS Assessment

In order to understand some of the issues discussed in this paper, it is necessary to have a
rudimentary understanding of the elements of the KIRIS assessment program. The purpose of this
section is to provide the reader with this rudimentary information.

The accountability index. The basis for describing a school's accomplishment is the KIRIS
accountability index. The accountability index for a school is the average performance of the
school's students over six separate measures: five cognitive achievement measures and one
noncognitive achievement measure. Each cognitive achievement measure reflects a school's
performance in one curriculum area. A school's performance on each of the six measures is also
reported. In a rough (but imprecise) way, a school's score on the accountability index can
translate into the percentage of successful students specified by KERA.

Cognitive achievement at the school level. The measure of a school's accomplishment in
each cognitive achievement area is the average achievement score of its students. For each of the
five curriculum areas, a student's score is obtained as follows. As a result of several types of
assessments in an area (which are described later), each student is classified into one of four
quality levels: novice, apprentice, proficient, and distinguished. Next, each student is assigned
points on the KIRIS score scale as shown in Table 1.

A student's score on the cognitive dimensions has a possible range from 0 to 140. If a
student is absent from the assessment, the student is assigned a KIRIS score scale value of 0.



Table 1: Relationship Between Level of Student Performance and
Accountability Score Scale Points

Student Quality Level      Corresponding Accountability
                           Score Scale Points

Novice                               0
Apprentice                          40
Proficient                         100
Distinguished                      140
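The scoring rule described above can be sketched in Python. This is only an illustration, not the KDE's actual computation. The point values assumed here (0, 40, 100, 140) follow the 0 to 140 range and the proficient value of 100 stated in the text. The names `LEVEL_POINTS`, `area_score`, and `accountability_index` are hypothetical, and the equal weighting of the six measures is an assumption drawn from the paper's description of the index as an average over six measures; Guskey (1994) describes the actual first-biennium computation.

```python
# Assumed points per quality level on the KIRIS score scale.
LEVEL_POINTS = {
    "novice": 0,
    "apprentice": 40,
    "proficient": 100,
    "distinguished": 140,
}


def area_score(levels):
    """Average KIRIS points for one curriculum area in one school.

    `levels` holds one entry per student; None marks an absent student,
    who is assigned 0 points as the paper describes.
    """
    if not levels:
        raise ValueError("at least one student is required")
    points = [0 if level is None else LEVEL_POINTS[level] for level in levels]
    return sum(points) / len(points)


def accountability_index(cognitive_scores, noncognitive_score):
    """Unweighted mean of five cognitive scores and one noncognitive score.

    Equal weighting of the six measures is an assumption of this sketch.
    """
    if len(cognitive_scores) != 5:
        raise ValueError("expected five cognitive area scores")
    return (sum(cognitive_scores) + noncognitive_score) / 6


# One student at each level plus one absent student: (0+40+100+140+0)/5.
example = area_score(["novice", "apprentice", "proficient", "distinguished", None])
```

Under these assumptions a school whose students all score proficient in every area, with a noncognitive score of 100, would have an index of 100, the desired minimum score discussed in the paper.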





The maximum score a school may attain on the noncognitive assessment is 100.

[...] a school must score at the novice level and the school must have a score of 0 on the
noncognitive measure. Similarly, at the other extreme all students must score at the
[distinguished level and the] school's noncognitive measure must equal 100. (Interested readers
can find a description of the computation of the accountability index for the first biennium
(1992-1994) in Guskey [1994].)

The desired minimum score for a school. Achieving at least the proficient level for all
students is a school's goal. This qualitative goal can be translated to a quantitative accountability
index value. Since the proficient level translates to 100 on the KIRIS score scale, a
student score of 100 in each cognitive area is the desired minimum accountability score. One of the
unstated goals of the Kentucky educational reform movement is that each Kentucky school is
supposed to reach the desired minimum score of 100 in 20 or fewer years.

How a school receives rewards or gets sanctioned. Based on the 1991-92 KIRIS
assessment, schools received a baseline score on the accountability index. This baseline score was
subtracted from 100 (the desired minimum accountability score 20 years after the start of the
program). The difference between the desired accountability score (100) and the baseline score
was a gap that each school had to close. The gap between the desired accountability score and the
baseline score was divided by 10. The division by 10 represents the length of the program: 10
bienniums, or 20 years. The gap divided by 10 represented the average gain on the accountability
index needed by a school in each biennium to avoid sanctions.

An example may clarify the previous paragraph. Assume a school received an
accountability score of 40 on the baseline (1991-92) KIRIS assessment. This baseline score
would be subtracted from 100. The difference between the desired accountability score (100) and
the baseline score (40) is a 60 point gap. This gap would be divided by 10, representing the
length of the program. In this case, the gain on the accountability index needed by this school to
avoid sanctions is 6 points. The school's 1992-93 and 1993-94 KIRIS accountability index results
would be averaged to determine the school's accountability index value at the end of the first
biennium.
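The arithmetic in this example can be written out as a short Python sketch. The constants (a desired score of 100 and a program length of 10 bienniums) come from the text. The function names are hypothetical and chosen only for illustration.

```python
DESIRED_SCORE = 100  # proficient level on the KIRIS score scale
BIENNIUMS = 10       # 20-year program, two years per biennium


def required_gain(baseline):
    """Average gain per biennium needed to close the gap to 100."""
    return (DESIRED_SCORE - baseline) / BIENNIUMS


def threshold(baseline):
    """Accountability index a school is expected to reach after one biennium."""
    return baseline + required_gain(baseline)


def biennium_average(first_year, second_year):
    """Mean of the two yearly accountability index results in a biennium."""
    return (first_year + second_year) / 2


# The paper's example: a baseline of 40 leaves a 60 point gap,
# so the required gain is 6 points and the threshold is 46.
gain = required_gain(40)
target = threshold(40)
```

Because the gap is divided by a fixed number of bienniums, a school with a lower baseline faces a larger required gain than a school that starts closer to 100.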

If the school in this example had a biennium accountability average of 46, the school would
be neither sanctioned nor rewarded. If the school's accountability index average was 47 or higher,
the school would receive financial rewards. If a school's accountability index average was less
than 40, the school would face sanctions under the KERA legislation. If the school's average
accountability index was between 46 and 47, the school would be classified as successful. Such a
result would subject the school to neither rewards nor sanctions. The KERA legislation did not
clearly define what happens to a school such as this one if its average accountability index value
was between its baseline and its threshold (in this example, between 40 and 46). The Kentucky
Department of Education (KDE) has determined that if a school does not achieve its threshold (46
in this case), but increases its accountability score, that school needs to develop a school
improvement plan.
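The outcome categories in this example can be summarized in a small Python sketch. It rests on two assumptions that go beyond the text: the one point reward margin is generalized from the single example (a threshold of 46 and rewards at 47 or higher), and a school that exactly maintains its baseline is grouped with schools needing an improvement plan, following the legislative language quoted earlier. The function name `classify_school` and the category labels are hypothetical.

```python
def classify_school(baseline, biennium_avg, reward_margin=1.0):
    """Return the outcome category for a school after one biennium.

    The threshold is the baseline plus one tenth of the gap to 100.
    `reward_margin` (assumed to be 1 point) is how far above the
    threshold a school must score to earn financial rewards.
    """
    threshold = baseline + (100 - baseline) / 10
    if biennium_avg >= threshold + reward_margin:
        return "reward"
    if biennium_avg >= threshold:
        return "successful"
    if biennium_avg >= baseline:
        # Below threshold without falling under the baseline.
        return "improvement plan"
    return "sanctions"


# The paper's example school, with a baseline of 40 and a threshold of 46.
outcomes = {avg: classify_school(40, avg) for avg in (47, 46, 43, 39)}
```

The ordering of the checks matters: each test assumes the stricter categories above it have already been ruled out.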

How the Assessment Tasks Within a Cognitive Measure are Weighted

Since the initial year (1991-1992), cognitive measures have been obtained in the curriculum areas
of mathematics, reading, science, social studies, and writing. Each area is assessed by a variety of
formats that are weighted differently. In 1993-1994, the formats for the first biennium were
weighted as shown in Table 2.







Table 2: Cognitive Weights on the KIRIS Assessment 1992-1994

Component                             Social Studies,
Assessment Format                     Science, Math      Reading     Writing (1)

1. Open-Ended Common Questions             40%             50%          NA
   (Five Questions)
2. Open-Ended Matrix-Sampled               40%             50%          NA
   (Each student is randomly
   assigned to answer 2 of a
   pool of 24 questions.)
3. Performance Events                      20%             NA           NA
4. Multiple Choice Questions (2)            0%              0%          NA
5. Portfolio (3)                            0%              0%         100%
6. On-Demand Writing Prompts (4)           NA              NA            0%

TOTAL                                     100%            100%         100%

How the Components Within the Noncognitive Measures are Weighted

The weights of the components comprising the noncognitive measures are shown in Table 3.


Table 3: Noncognitive Weights on the KIRIS Assessment, 1992-1994

Component                    4th grade       8th grade     High School

Attendance                   [illegible]        40%           20%
Retention                    [illegible]        40%            5%
Dropout                         NA              20%           37.5%
Transition to Adult Life        NA              NA            37.5%

TOTAL                          100%            100%          100%

We note here that starting with the 1994-1995 KIRIS assessment, the noncognitive
measure will be lagged by one year. That is, the computation for the 1994-1995 school year will
be based on a school's 1993-1994 data for the four components. This was done because there was
insufficient time to collect and disseminate the data for the 1994-1995 school year.



1. Writing is assessed only through a portfolio.

2. Multiple-choice tests were administered, but the results were not counted in the
accountability measure.

3. Only mathematics was assessed by a portfolio, and the results were not counted toward
the accountability measure.

4. On-demand writing prompts were administered, but the results were not counted in the
accountability measure.








Recent Changes in the Assessment 

Because the KIRIS assessment is an innovative and developing program, it is reasonable to 
expect that changes and fine-tuning will be done each year the program is in place. The following 
changes were made to the KIRIS assessment during the 1994-1995 school year. 

1. Assessment using a mathematics portfolio at the fourth grade was discontinued and in
its place, a fifth grade mathematics portfolio was used.

2. Whereas the mathematics portfolio had not been counted in the accountability index in
the past, it counted in the 1994-1995 assessment. The State Board for Elementary and
Secondary Education decided that the mathematics portfolio counted for 30% of the
total math score.

3. The accountability grade in high school was grade 11 instead of grade 12. However,
the portfolios in writing and mathematics will still be due in grade 12.

4. The accountability grades were 4, 8, and 11. However, grade 12 students were also
assessed. This was done for purposes of equating scores from previous years to the
1994-1995 year.

5. Although curriculum areas beyond the five mentioned previously were assessed in
1993-1994, they did not count in the accountability index. These areas were arts,
humanities, and practical living/vocational studies. These assessments will count in
the second biennium (1994-1996). Currently, arts, humanities, and practical
living/vocational studies are assessed in the scrimmage (practice) tests that can be
administered in non-accountability grades. Performance events were also used for
these areas in 1993-1994.

In what may become the biggest change to the KIRIS assessment, a new RFP was issued by the
KDE in late 1995. With this RFP, the KIRIS assessment will be redesigned, heading into the
new millennium. Like the earlier KIRIS assessment, most cognitive areas will be covered with
multiple modes of assessment: multiple choice, constructed response, portfolio, and performance
events. However, multiple choice questions may count on the KIRIS assessment for the first time,
increasing the number of assessment modes that carry weight to determine a school's
accountability score. It is possible that the new KIRIS assessment will show improved reliability
with both the individual student level and the school level scores.

Intent of KERA: Shake Up the System 

KERA was a complex piece of legislation that accomplished more than simply mandating a
performance assessment system. As envisioned by its supporters, KERA would create an
environment for an inclusive system focusing on improved educational outcomes for school
children. Borrowing from the African proverb that "it takes an entire village to raise a child",
teachers, principals, and parents would all work together in the Site-Based Management Councils
(SBMC) to improve instruction, help raise educators' expectations concerning the work students
were capable of doing, and (eventually) move the school to higher KIRIS scores. Inclusive models
of professional development that simultaneously involve many different stakeholder groups in
public education might help make a difference in student outcomes. Currently, much professional
development activity in the United States focuses on one group at a time, asking "what do the
parents have to do to improve student outcomes", "what do the teachers have to do to improve
student outcomes", or "what do the principals have to do to improve student outcomes". These
"one group at a time" efforts often work in isolation from the professional development of other
major stakeholder groups in public education. With the SBMC, Kentucky's systemic reform
initiative has a mechanism for stakeholder groups to work together for improved education
outcomes at the local level. Taken together, the eight propositions that constitute the philosophy of
KERA (Wilkerson and Associates, Inc. 1994) were designed to shake up public education in
Kentucky. The eight propositions are:

A. All children can learn at a relatively high level.

B. The state should set high standards of achievement for all children.

C. More learning resources should be focused on students who have not succeeded in
meeting the state's learning standards.

D. Decisions affecting instruction can best be made at the local school level.

E. In primary schools, students should not be labeled as belonging to a specific grade.

F. It is not enough to require that students show their knowledge of facts; they must also
demonstrate that they can apply what they know in real life situations.

G. Both rewards and sanctions are necessary to hold schools accountable for improving
student performance.

H. Higher performance levels by all children are important for economic growth of
Kentucky.



Five of the propositions (A, B, D, F and H) were taken from the systemic education reform
[...]. Instead of talking about systemic educational reform, Kentucky
overhauled its K-12 public education, installed new assessments, introduced an accountability
system, and made a major financial commitment to professional development of educators. KERA
was passed in response to a court order and a perceived crisis in the quality of public education in
the Commonwealth. KIRIS was born out of frustration with an incremental approach to reform
that never seemed to change the status quo (Haertel 1994). The intent of KERA was to shock the
system by [illegible] educational business as usual, and to set a new course. As [...]

[...] recognized that new tests alone would not be enough and
embedded this in a comprehensive reform package. Educational funding was
increased dramatically to pay for additional teacher training, new educational
materials, and other needed changes. In addition, the legislature deserves credit for
recognizing that change will take time. Many promising educational innovations
have been undermined by an insistence on quick results. KIRIS features a
reasonable phased implementation and a realistic 20-year time line for attaining its
ultimate goals. Along the same lines, the Kentucky reforms take account of the
different starting points of various schools, differentiating improvement targets
according to initial achievement levels.



Impact of the KIRIS assessment on education in Kentucky 

[...] for extremely broad and sweeping educational reforms. It has been six years
[...] implementation of the KIRIS assessment.
Proponents of KERA reasoned that "assessment would drive curriculum". Educators (teachers
[and principals) could not] ignore the Kentucky reform effort because of the sanctions
component of the KIRIS assessment. If educators ignored the reform effort by conducting
business as usual, and KIRIS assessment scores went down, the schools where those educators
worked would be subject to sanctions. If the school was subject to the most severe sanction the
[...] teachers and principals might lose their jobs. However, if educators
[...] assessment scores went up,
the State would be satisfied. KERA mandated that assessment scores increase. Teachers, with the
help of principals and site based management councils, were responsible for providing instruction
that would increase assessment scores. Neither KERA nor the KDE has mandated a teaching style
or a structured curriculum.



Proponents of KERA thought the rewards and sanctions component of the KIRIS
assessment was required for educators to take the reform seriously.
[...] KERA thought that the "stakes" associated with the assessment would serve as a crucial
component in the systemic reform initiative, driving instruction and curriculum towards tasks
stressed on KIRIS. With the reforms in place for six years, it may be useful to determine the
impact of the KIRIS assessment in reforming education in Kentucky. Clearly, long term impacts
of the assessment system may not be noticed for many years.



Positive Effects of the Assessment

1. Concentration on students' writing. As envisioned by [...], the
KIRIS system of rewards and sanctions is supposed to motivate principals, teachers, and site-
based management councils to alter the instructional curriculum presented to students. It has been
argued in Kentucky that "assessment drives curriculum." The KIRIS assessment is heavily
oriented toward writing. The biggest weight on the KIRIS assessment is assigned to the open-
ended constructed response questions. The second biggest weight is assigned to the writing
portfolio. In the five cognitive areas covered in the first biennium, writing in one form or another
accounted for 88 percent of the weight on the KIRIS assessment. Counting performance events as
writing (students have to write their responses to situations) increases the weight of writing to 100
percent of the KIRIS assessment. Even in subjects like mathematics, students must write answers
to questions. Students who do not write well will not do well on the KIRIS assessment,
irrespective of their knowledge of subject matters like mathematics. Such a heavy weight on
writing on the assessment may have made it easier for educators to stress writing when teaching.
Students do report more writing under the reforms (Coe, Leopold, Simon, & Williams, 1994).



2. Improvement in students' writing quality. Scores on the writing portfolios have improved
since the baseline KIRIS assessment in 1991-1992. Teachers, District Assessment Coordinators,
and superintendents report almost unanimously that writing has improved; the writing
improvement was over and above what would have been expected of most school children of the
same age. We believe (but we cannot be sure) that the reported increase and improvement in
students' writing is due to the heavy weight on writing on the KIRIS assessment and, ultimately,
the prospect of rewards and sanctions based on KIRIS assessment results. It is a limitation of this
paper that we did not gather and study student portfolios and other evidence to assess whether the
quality of students' writing has actually improved.



3. Involvement of students in cooperative problem-solving. Performance events add a group
component to the KIRIS assessment. However, performance events accounted for only 12 percent of
the weight of the KIRIS assessment in the first biennium. Even with a relatively light weight, the
performance events (group activities involving problem solving or experimentation) meaningfully
engage students. Consistent with the 1994 revised legislative requirements, they yield experiences
but not assessments of the ability of individual students to work productively and collaboratively in
groups and perform important learning targets. Students reported increased group work since the
passage of the KERA (Coe, Leopold, Simon, & Williams, 1994).



4. Instructional contribution of portfolios. Another benefit of the KIRIS assessment has been
the development of portfolios. Portfolios of students' mathematics and written work appear to
have great instructional potential. Students have to choose their best work to put in the portfolio,
and this gives students a chance to reflect upon their intellectual growth over a school year. A
more passive assessment system, like a test consisting of 100 percent multiple-choice questions,
does not meaningfully engage the student in the same way. Choosing pieces for inclusion in the
portfolio engages the student more, thereby increasing the student's involvement in deciding the
material on which the student is to be assessed. For teachers, evaluating the best work of their
students gives them critical feedback that they may use in deciding what instructional materials and
assistance would best serve the entire class as well as individual students.

5. Increased Attention to Students with Special Needs. The limited evidence we gathered with
respect to special populations showed that the educational system was paying considerably more
attention to these groups than before the KIRIS system was set up. Special education students
always take the KIRIS assessment unless they are severely handicapped. A severely handicapped
student is assessed using an alternative portfolio. Some educators reported that the educational
system was paying more attention to special education students because such students were
included in the accountability system. Nearly 99.5% of students in the accountability grades are
assessed in one form or another by KIRIS. Many (if not most) other statewide assessment
programs do not include special education students and other students with disabilities. It is
possible that school systems would take the difficult job of educating children with disabilities
more seriously if such students were included in an assessment program. If KDE can document
that special education students are receiving increased instructional attention because of KIRIS, this
evidence would support the consequential validity of the accountability index.

6. The Sanctions Component of KIRIS May Have Provided Members of the State Legislature
With Some Protection Against Retaliation from Constituents after Supporting a Big Tax Increase.
There is some evidence from national public opinion polls that the public will tolerate large tax
increases if the money raised from such a tax does not "go into a black hole" (Johnson and
Immerwahr, 1994). In other words, if the public feels the additional money raised by a tax increase
is not wasted, politicians supporting a tax increase are less likely to face retaliation at the voting
booth in future elections. The accountability provisions of KIRIS may have provided members of
the state legislature who voted for the tax increase to fund KERA a buffer against constituents
generally opposed to tax increases. Members of the State Legislature can claim "if the money is
wasted, the accountability scores will not improve, and sanctions will (ultimately) take care of
people who are not producing". A number of members of the state legislature voting for KERA
have been targeted by opposition groups for defeat. However, no member of the state legislature
targeted for defeat by anti-KERA forces has (yet) been defeated where the KERA vote has been
seen as a key component of that defeat.

The KIRIS accountability system created a visible means of public accountability for the
school system. Like the football coach who finds his won-lost record publicized, the accountability
scores for all schools in the Commonwealth are published in the two newspapers with statewide
circulation and many local papers. Such an accountability system allows members of the state
legislature and other citizens to know if schools in their community earned rewards or face
sanctions. The rewards (and sanctions) given to schools are highly symbolic achievements that
may reflect positively (or negatively) on an entire community. Thus, when the state grants awards
to a particular school, it is not only giving employees a small cash bonus, it is giving a
"pat on the back" to an entire community.



Negative Effects of the Assessment 

1. The KIRIS assessment has limited use for assessing the educational progress of individual
children. Parents (typically) want to know how their child is doing in school in relation to the
child's own capabilities and sometimes in relation to peer groups in other communities or in other
states. KIRIS is a school based accountability measure, looking at grade-on-grade changes in a
school, and does not track individual students over time. The current reliabilities (on the KIRIS
assessment) "are not sufficiently high to make student level decisions without additional
information" (Kentucky Department of Education 1994). It is possible that the KDE is moving to
address the issue of inadequate student level reliability in future years of the KIRIS assessment.
The RFP issued in October 1995 will make it easier to create an assessment with improved
student level reliabilities.

2. The corruptibility of high stakes assessment. Madaus (1988) notes that high stakes test
scores may become the most important goal of education. Haertel (1994) pointed out an important
component of education is lost when teachers and students work for higher scores on the KIRIS
accountability index instead of the intellectual attainments an increasing accountability score is
supposed to represent. With any recurrent high-stakes assessment, a tradition of past examinations
develops, and, over time, examiners become reluctant to make significant changes from year to
year because then teachers will not know what to teach. If that happens, the domain of assessment
tasks will grow too narrow. Assessment scores will rise, but instruction will become stereotyped.
As a consequence, students who have learned to do well on the particular kinds of items included
in the assessment may do poorly on equally valid items that are assessing the same skills in a
slightly different way.

3. The heavy concentration on writing and group work may detract from the development of
other skills, especially basic skills. Resnick and Resnick (1991) have pointed out "you get what
you assess and don't get what you don't assess". At the present time, the KIRIS assessment is
oriented almost exclusively toward writing. The increase in KIRIS scores observed in the first
three years of the program may reflect the heavy weight given to writing. Other skills are assessed
in the KIRIS, but subsumed within writing. Alternative assessments of any type (authentic or
multiple choice) stressing non-writing skills may show declines.

As mentioned earlier, the high stakes component of the KIRIS assessment is supposed to
alter the way teachers teach. By stressing "higher order" cognitive skills (written responses to
questions) it was thought that teachers would no longer "drill and kill" students to memorize bits
and pieces of fragmented information. Examples of such bits and pieces of information would be
memorization of multiplication tables and spelling. It was thought that "drill and kill" instructional
strategies not only turn students off to learning, but focus learning on a narrow set of skills that
do not easily generalize to acquiring other information. Assessments like CTBS-IV and KEST may
create in the minds of teachers the importance of bits and pieces of fragmented information in the
curriculum. The reformers reasoned that teachers could not ignore the authentic assessment if
stakes were associated with part of the system. However, the creation of the new assessment
system presented at least two challenges to teachers: what to teach and when to teach it.

Teachers have a limited (and fixed) amount of time for instruction over an academic year.
In that limited and fixed amount of time, teachers have to decide how to allocate their time to
achieve instructional aims. Teachers in Kentucky, especially those teaching in the accountability
years, need to allocate some of their instructional effort to teaching students to respond to essay
questions and preparing students to develop portfolios. Teachers can turn any of the higher order
cognitive skills assessed by KIRIS into a mechanical process by repeatedly teaching students the
method of responding to essay questions. If teachers spend weeks (or months) training students to
answer essay questions, it is possible that such students will not have learned enough information
to master the content of a given subject. Such students may be able to master the process of doing
well on the accountability assessment (KIRIS) and some of the higher-order thinking skills
stressed by KIRIS, but perform poorly on skills not stressed by KIRIS. The KIRIS assessment
does not stress some basic skills, like mathematical calculation.

Some evidence for a drop in basic skills comes from CTBS-IV test results from the largest
school district in the Commonwealth of Kentucky. In a pre-test, post-test, quasi-experimental
design, the pre-test was the average percentile rank of a grade of students in the Jefferson County
Public School district before the KERA was passed (Campbell 1969, Cook and Campbell 1979).
The post-test was the average percentile rank of a grade of students in the Jefferson County Public
School district after the passage of the KERA. All component tests on the CTBS-IV showed decreases
in average percentile ranks after the passage of KERA, when compared to average percentile ranks
before the passage of KERA. Decreases ranged from 8 percentile points to over 20 percentile
points. The two largest drops, math computation scores for the 6th and 9th grades, are presented in
Figures 1 and 2, respectively. Again, the KDE may deal with this issue when KIRIS is
revised to meet the requirements of the October 1995 RFP.

[FIGURE 1. Sixth grade math computation, CTBS-IV, plotted by year.]

[FIGURE 2. Ninth grade math computation, CTBS-IV, plotted by year; vertical axis (MATH9TH) runs from 35 to 55.]

4. The Public Does Not Yet Trust the Assessment. It is typical that stakeholders question the
validity of new methods to demonstrate accountability. Initial skepticism is an appropriate
response because it allows stakeholders to pose questions, seek rationales, and buy time until the
new system provides data addressing legitimate concerns. This questioning demonstrates that
stakeholders are taking an innovation seriously. Therefore, it is not surprising to find that
stakeholders have serious concerns about KIRIS.

A recent study conducted by Wilkerson and Associates, Inc. (1994) for the Kentucky
Institute for Education Research found that principals, coordinators/supervisors, teachers, school
council parents, public school parents, and the general public all ranked student performance on
the KIRIS as the measure least likely to provide a reliable indicator of student learning. These
diverse constituencies had most confidence in the percentage of students who finished high school.
A study of the Kentucky state legislature found that 44 percent of the responding legislators said
the most common complaint mentioned by the public was that the KIRIS was an inaccurate
measure of students' abilities (Horizon Research International, 1994).

5. Overconcentration of assessment based on writing ability. The KIRIS assessment
currently is heavily oriented toward evaluations based on writing. Students who have content
knowledge of the discipline but lack adequate writing skills are precluded from doing well on the
assessment. Some examples of alternative kinds of assessments that could lessen the impact of
writing on the assessment include oral communication, involving the giving of a speech; creating a
presentation that simultaneously involves written and visual information, typically called
multimedia and usually done on a computer; and performing and fine arts. We note too that
multiple-choice items offer students who lack adequate writing ability an alternative way to express
the knowledge they have. Multiple-choice tests require good reading skills, however.

6. Portfolios will increase teacher workloads, producing increased stress. Vermont
implemented a low stakes statewide portfolio assessment program during the 1991-92 school year.
Teachers involved in the Vermont portfolio assessment reported portfolios as a worthwhile burden
(Koretz, Stecher, Klein, & McCaffrey, 1994). However, these same teachers reported that
portfolios caused considerable stress. Koretz, Stecher, Klein, and McCaffrey (1994) report:

    The pressures experienced by educators went beyond time demands.
    For example, more than half reported difficulty finding appropriate tasks.
    Educators also reported feeling stress because of their uncertainty about
    appropriate uses of portfolio scores; the rapid implementation of the
    program; and inadequate, tardy, and inconsistent information from the
    state.

In focus groups held in Kentucky, we found that teachers thought that portfolios were
beneficial to instruction. However, Kentucky teachers reported the same concerns as their
Vermont counterparts. Increased stress was reported by nearly every teacher attending the focus
groups. Additionally, a study of one Western Kentucky school district found that teacher stress is
extremely high, approaching debilitating levels for many (Hughes & Craig, 1994).

Portfolios increase teacher workloads. This increase in teacher workload occurred in both
a low stakes assessment program (Vermont) and a middle stakes assessment program (Kentucky).
It is possible that Kentucky teachers experience greater job stress than Vermont teachers due to the
high stakes Kentucky assessment. However, we have no data on this question.

More General Problem: The Public Does Not Yet Trust Reforms Associated with Systemic
Initiatives

From national surveys, we can conclude that the public does not object to replacing
multiple-choice tests with more authentic assessments (Johnson and Immerwahr, 1994).
Additionally, Johnson and Immerwahr (1994, p. 19) state:

    Previous research by Public Agenda has suggested that large numbers of Americans,
    like leaders, question the usefulness of multiple-choice exams and favor alternatives
    such as essay tests, portfolios and demonstration projects when they are used in
    conjunction with grades. In this study, 54% of respondents say replacing multiple-
    choice tests with essay tests would improve academic performance -- an endorsement,
    but one that falls significantly short of people's support for removing disruptive
    students (73%) or making correct English a requirement for graduation (88%).

    The problem that education reformers face in their drive to replace multiple-choice
    tests with more authentic forms of assessment is not that people object to the idea.
    The problem is that this particular recommendation seems somewhat tangential to
    people's chief concerns about the schools. It is as if people are saying, "Well, that's
    all well and good, but what about the guns, the drugs, the truancy, and the students
    who can't add, spell, or find France on a map?"

Systemic reformers, including leaders of the Kentucky reform movement, seem to be at
odds with the public's perception of what needs to be done in the schools. The public has three
major concerns: order, discipline, and teaching the basics (Johnson and Immerwahr, 1994; Gallup
Public Opinion Poll, 1994). Systemic reformers stress higher order cognitive skills, authentic
assessment, enriched curriculum, and improved professional development of educators (Cohen,
1995). It seems that systemic reformers are not addressing the public's major educational
concerns. According to Yankelovich (1995), educational leaders must engage the public's
preoccupation with order, discipline and teaching the basics. Even after addressing the public's
major concerns, leaders will need to mount an awareness campaign to increase support for systemic
reform initiatives. Such a campaign can take a few years before public opinion surveys show
increasing public support for the actions of educational leaders. In 1996, it is all too easy for
politicians or other leaders to cripple a systemic reform initiative because such initiatives currently
have (at best) very limited public support.



Conclusion 

The KERA is one educational reform that has impacted instruction in the classroom for
many students in Kentucky. The rewards and sanctions component of the assessment system
made it very difficult for educators (teachers, principals, and superintendents) to ignore the reform
act.

The act was designed to shock the system and force change. In many respects the system
was shocked and change forced upon educators, perhaps due to the sanctions of the KIRIS
assessment. However, shocking the system was a short-term strategy to move the system in a
non-incremental manner. The problem in Kentucky is now to move from a new system that
promised non-incremental educational change to a system that can fine tune itself without external
shocks. A committee of the state legislature came within one vote of overhauling the KIRIS
assessment system in July 1995. It is not clear that the current system, sustained on rewards and
sanctions, will last until 2012, the supposed end date of the program.


Acknowledgments 

We wish to thank the Kentucky Institute for Education Research, especially Roger Pankratz
and NTila Weddle, for all of the help they have given to the research team over the last seven
months. Without their help, this report would have suffered severe shortcomings. With their
help, we have been in a better position to avoid factual and interpretive errors.

We also want to thank the hundreds of other people throughout the Commonwealth of 
Kentucky who played a role in this study. Unfortunately, only a few of these people can be 
mentioned in a paper like this one.

During the qualitative data-gathering role, Robert Rodosky (Louisville), Duane Miller 
(Owensboro), Evonne Slusher (Bell County), and Joel Brown (Bowling Green) put together an 
impressive group of participants for the focus groups under a very tight timetable.

Many thanks to the Kentucky Association of Assessment Coordinators (KAAC) for
allowing Fenster to "invite himself" to the May 20, 1994, meeting of the group. Because of that
meeting, we added to the basic methodology of the study and decided to send surveys directly to
stakeholder groups. If Fenster had not been able to attend that meeting, the idea for the survey
would not have materialized. The surveys improved the quantity and quality of evidence presented
in this report.

A special note of thanks to the 113 DACs and 70 superintendents who took the time from 
their busy schedules to answer an intensive questionnaire about their experiences with KERA and 
KIRIS. Without the time and effort of these people, the study would have been significantly 
weaker. 

We thank Edward Reidy of KDE and Richard Hill and Amy Sosman of ASME for taking
the time from their busy schedules to provide documents and to repeatedly answer our telephone
questions on the KIRIS assessment.

We also recognize the long and hard work put into the KIRIS assessment system by ASME 
and KDE. Performance assessments are not yet "commonly" used. The problems with these new 
kinds of assessments have not yet been worked out technically nor operationally. It would have 
been easy for ASME and KDE to go slowly when implementing a new performance assessment
system. ASME and KDE took the tougher road, bypassed the transitional testing period, and
implemented the legislatively mandated performance-based system immediately.



REFERENCES 



American Educational Research Association, American Psychological Association, & National
Council on Measurement in Education. (1985). Standards for educational and psychological
testing. Washington, DC: American Psychological Association.

Campbell, D. T. (1969). Reforms as Experiments. American Psychologist, 24:409-429.

Cohen, D. K. (1995). What is the System in Systemic Reform? Educational Researcher, 24, No. 9,
pp. 11-17, 31.

Cook, T. D., & Campbell, D. T. (1979). Quasi-Experimentation: Design and Analysis for Field
Settings. Chicago, IL: Rand McNally.

Coe, P., Leopold, G., Simon, K., & Williams, J. (1994). Perceptions of school changes:
Interviews with Kentucky students. A Report Submitted to the Kentucky Caucus of the
Appalachia Educational Laboratory Board of Directors. Charleston, WV: AEL.

Gallup Public Opinion Poll. (1994). Gallup Survey of the Public's Attitudes Toward Public
Schools. National Phone Survey of 1,326 adults.

Guskey, T. R. (1994). High stakes performance assessment: Perspectives on Kentucky's
educational reform. Thousand Oaks, CA: Corwin Press.

Haertel, E. H. (1994). Theoretical and Practical Implications. In T. R. Guskey (Ed.), High stakes
performance assessment: Perspectives on Kentucky's educational reform, pp. 65-75. Thousand
Oaks, CA: Corwin Press.

Horizon Research International. (1994). A survey of legislators on Kentucky instructional results
information system (KIRIS). Legislative Research Commission, Office of Education
Accountability. Frankfort, KY: Author.

Hughes, K. R., & Craig, J. R. (1994, November). Using performance assessment achievement
data to evaluate a primary instructional program. Paper presented at the annual meeting of the
American Evaluation Association, Boston, MA.

Johnson, J. and Immerwahr, J. (1994). First Things First: What Americans Expect from the
Public Schools. New York, NY: Public Agenda.

Kentucky Department of Education. (1994). Kentucky instructional results information system:
1992-93 technical report. Frankfort, KY: Author.

Kentucky Department of Education. (1993). Kentucky instructional results information system:
1991-92 technical report. Frankfort, KY: Author.

Koretz, D., Stecher, B., Klein, S., & McCaffrey, S. (1994). The Vermont portfolio assessment
program: findings and implications. Educational Measurement: Issues and Practices, 12(4), pp.
5-16.

Linn, R. L. (1994). Performance assessment: Policy promises and technical measurement
standards. Educational Researcher, 22(9), pp. 4-14.

Madaus, G. F. (1988). The Influence of Testing on the Curriculum. In L. N. Tanner (Ed.), Critical
Issues in Curriculum (Eighty-seventh yearbook of the National Society for the Study of Education,
pt. 1, pp. 83-121). Chicago: University of Chicago Press.

Resnick, L. B. and Resnick, D. P. (1991). Assessing the Thinking Curriculum: New Tools for
Educational Reform. In B. G. Gifford and M. C. Conner (Eds.), Changing Assessments:
Alternative Views of Aptitude, Achievement and Instruction (pp. 37-75). Boston: Kluwer
Academic Publishers.

Rose v. Council for Better Education, Inc. KY 88-SC-804-TG (September 28, 1989).

Wilkerson & Associates, Ltd. (1994). Statewide education reform survey. Louisville, KY:
Author.

Yankelovich, D. (1995). The Crisis in Education: Perspectives from the Public. Oral Presentation
given at the Educational Policy Institute, Oakland University, September 7-8.







